A Trustworthy Industrial Fault Diagnosis Architecture Integrating Probabilistic Models and Large Language Models

Wu, Yue

arXiv.org Artificial Intelligence

Abstract: Addressing the core problem of insufficient trustworthiness in industrial fault diagnosis, which stems from the limitations of existing methods (both traditional and deep learning-based) in interpretability, generalization, and uncertainty quantification, this paper proposes a trustworthy industrial fault diagnosis architecture, the Hierarchical Cognitive Arbitration Architecture (HCAA), which integrates probabilistic models with Large Language Models (LLMs). The architecture conducts a preliminary analysis via a diagnostic engine based on a Bayesian network and features an LLM-driven cognitive arbitration module with multimodal input capabilities. This module performs expert-level arbitration on the initial diagnosis by analyzing structured features and diagnostic charts, and holds priority to make the final decision when it detects a conflict. To ensure the reliability of the system's output, the architecture integrates a confidence calibration module based on Temperature Scaling and a risk assessment module, which objectively quantify system trustworthiness using metrics such as Expected Calibration Error (ECE). Experimental results on a dataset containing multiple fault types demonstrate that the proposed framework improves diagnostic accuracy by over 28 percentage points compared to baseline models, while post-calibration ECE is reduced by more than 75%. Case studies confirm that the HCAA effectively corrects misjudgments from traditional models caused by complex feature patterns or knowledge gaps, providing a novel and practical engineering solution for building high-trust, explainable AI diagnostic systems for industrial applications.

Keywords: Industrial Fault Diagnosis; Large Language Model (LLM); Hierarchical Cognitive Arbitration; Probabilistic Model; Confidence Calibration; Trustworthy AI

1. Introduction

With the deep development of Industry 4.0 and smart manufacturing concepts, modern industrial systems are evolving towards high levels of automation and intelligence. In this process, the reliability and safety of equipment have become key factors determining production efficiency and operational costs. Prognostics and Health Management (PHM), as a core technology, plays an indispensable role in improving equipment reliability, reducing unplanned downtime, and optimizing maintenance costs by monitoring equipment status in real time, diagnosing potential faults, and predicting remaining useful life [1], [2].
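The calibration machinery the abstract describes, Temperature Scaling plus Expected Calibration Error (ECE), can be sketched in a few lines. This is an illustrative sketch, not the HCAA code; the function names and the equal-width binning scheme are assumptions:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: weighted average gap between mean confidence and accuracy per bin."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece, n = 0.0, len(confidences)
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            acc = correct[mask].mean()        # empirical accuracy in bin
            conf = confidences[mask].mean()   # mean predicted confidence in bin
            ece += (mask.sum() / n) * abs(acc - conf)
    return ece

def temperature_scale(logits, T):
    """Divide logits by a scalar temperature T before the softmax; T > 1
    softens overconfident predictions, T < 1 sharpens them."""
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(z)
    return p / p.sum(axis=1, keepdims=True)
```

In practice T is fit on a held-out validation set by minimizing the negative log-likelihood, after which ECE is recomputed on the calibrated probabilities.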


How to Evaluate Medical AI

Kopanichuk, Ilia, Anokhin, Petr, Shaposhnikov, Vladimir, Makharev, Vladimir, Tsapieva, Ekaterina, Bespalov, Iaroslav, Dylov, Dmitry V., Oseledets, Ivan

arXiv.org Artificial Intelligence

The integration of artificial intelligence (AI) into medical diagnostic workflows requires robust and consistent evaluation methods to ensure reliability and clinical relevance. Traditional metrics like precision and recall often fail to account for the inherent variability in expert judgments, leading to inconsistent assessments of AI performance. Inter-rater agreement statistics like Cohen's Kappa are more reliable but lack interpretability. We introduce Relative Precision and Recall of Algorithmic Diagnostics (RPAD and RRAD), new evaluation metrics that compare AI outputs against multiple expert opinions rather than a single reference. By normalizing performance against inter-expert disagreement, these metrics provide a more stable and realistic measure of the quality of a predicted diagnosis. Beyond the comprehensive analysis of diagnostic quality measures, our study yields an important side result: our evaluation methodology avoids selecting diagnoses from a limited list when evaluating a given case. Instead, both the models being tested and the examiners verifying them arrive at a free-form diagnosis. With this automated methodology for establishing the identity of free-form clinical diagnoses, 98% accuracy becomes attainable. We evaluate our approach using 360 medical dialogues, comparing multiple large language models (LLMs) against a panel of physicians. This large-scale study shows that top-performing models, such as DeepSeek-V3, achieve consistency on par with or exceeding expert consensus. Moreover, we demonstrate that expert judgments exhibit significant variability, often greater than that between AI and humans. This finding underscores the limitations of any absolute metric and supports the need to adopt relative metrics in medical AI.
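A relative metric of this kind normalizes model-to-expert agreement by inter-expert agreement. The sketch below is one illustrative reading of that idea; the paper's exact RPAD/RRAD definitions may differ, and the function names are hypothetical:

```python
from itertools import combinations

def agreement(a, b):
    """Fraction of cases where two raters give the same diagnosis."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def relative_agreement(model, experts):
    """Mean model-to-expert agreement divided by mean inter-expert
    agreement. A value near 1 means the model disagrees with experts
    about as often as experts disagree with each other."""
    m2e = sum(agreement(model, e) for e in experts) / len(experts)
    pairs = list(combinations(experts, 2))
    e2e = sum(agreement(a, b) for a, b in pairs) / len(pairs)
    return m2e / e2e
```

A score above 1 indicates the model is more consistent with the expert panel than the experts are with each other, which matches the paper's finding for top-performing models.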


FusionMAE: large-scale pretrained model to optimize and simplify diagnostic and control of fusion plasma

Yang, Zongyu, Yang, Zhenghao, Tian, Wenjing, Li, Jiyuan, Sun, Xiang, Zheng, Guohui, Liu, Songfen, Wu, Niannian, Li, Rongpeng, Xu, Zhaohe, Li, Bo, Shi, Zhongbing, Gao, Zhe, Chen, Wei, Ji, Xiaoquan, Xu, Min, Zhong, Wulyu

arXiv.org Artificial Intelligence

In magnetically confined fusion devices, the complex, multiscale, and nonlinear dynamics of plasmas necessitate the integration of extensive diagnostic systems to effectively monitor and control plasma behaviour. The complexity and uncertainty arising from these extensive systems and their tangled interrelations have long posed a significant obstacle to the acceleration of fusion energy development. In this work, a large-scale model, the fusion masked auto-encoder (FusionMAE), is pre-trained to compress the information from 88 diagnostic signals into a compact embedding, providing a unified interface between diagnostic systems and control actuators. Two mechanisms are proposed to ensure a meaningful embedding: compression-reduction and missing-signal reconstruction. Upon completion of pre-training, the model acquires the capability for 'virtual backup diagnosis', enabling the inference of missing diagnostic data with 96.7% reliability. Furthermore, the model demonstrates three emergent capabilities: automatic data analysis, a universal control-diagnosis interface, and enhanced control performance on multiple tasks. This work pioneers large-scale AI model integration in fusion energy, demonstrating how pre-trained embeddings can simplify the system interface, reduce the number of diagnostic systems required, and optimize operating performance for future fusion reactors.
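The missing-signal reconstruction mechanism follows the masked-autoencoder recipe: random channels are hidden at training time and the model must reconstruct them, which later enables the 'virtual backup diagnosis'. A minimal sketch of just the masking step, with hypothetical names (88 channels as in the abstract):

```python
import numpy as np

def mask_signals(batch, mask_ratio=0.3, rng=None):
    """Randomly zero out a fraction of the diagnostic channels.
    batch: (n_samples, n_channels) array. Returns the masked batch
    and the boolean channel mask so the reconstruction loss can be
    computed only on the hidden channels."""
    rng = rng or np.random.default_rng(0)
    masked = batch.copy()
    mask = rng.random(batch.shape[1]) < mask_ratio
    masked[:, mask] = 0.0  # hidden channels; the AE must predict them
    return masked, mask
```

At training time the loss is taken over the masked channels only, so the encoder is forced to infer each signal from the others rather than copy it through.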


Accurate Diagnosis of Respiratory Viruses Using an Explainable Machine Learning with Mid-Infrared Biomolecular Fingerprinting of Nasopharyngeal Secretions

Zhang, Wenwen, Tang, Zhouzhuo, Feng, Yingmei, Yu, Xia, Wang, Qi Jie, Lin, Zhiping

arXiv.org Artificial Intelligence

Accurate identification of respiratory viruses (RVs) is critical for outbreak control and public health. This study presents a diagnostic system that combines Attenuated Total Reflectance Fourier Transform Infrared Spectroscopy (ATR-FTIR) of nasopharyngeal secretions with an explainable Rotary Position Embedding-Sparse Attention Transformer (RoPE-SAT) model to accurately identify multiple RVs within 10 minutes. Spectral data (4000-400 cm-1) were collected, and the bio-fingerprint region (1800-900 cm-1) was employed for analysis. Standard normal variate (SNV) normalization and second-order derivatives were applied to reduce scattering and baseline drift. Gradient-weighted class activation mapping (Grad-CAM) was used to generate saliency maps, highlighting the spectral regions most relevant to classification and enhancing the interpretability of model outputs. Two independent cohorts from Beijing Youan Hospital, processed with different viral transport media (VTMs) and drying methods, were evaluated: one including influenza B, SARS-CoV-2, and healthy controls, and the other including mycoplasma, SARS-CoV-2, and healthy controls. The model achieved sensitivity and specificity above 94.40% across both cohorts. By correlating model-selected infrared regions with known biomolecular signatures, we verified that the system effectively recognizes virus-specific spectral fingerprints, including lipids, Amide I, Amide II, Amide III, nucleic acids, and carbohydrates, and leverages their weighted contributions for accurate classification.
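The two preprocessing steps named above, SNV normalization and second-order derivatives, are standard chemometrics operations and can be sketched as follows. This is an illustrative sketch, not the paper's pipeline; a plain finite-difference derivative stands in for the Savitzky-Golay filtering commonly used in practice:

```python
import numpy as np

def snv(spectrum):
    """Standard normal variate: center and scale each spectrum
    individually, suppressing multiplicative scattering effects."""
    return (spectrum - spectrum.mean()) / spectrum.std()

def second_derivative(spectrum):
    """Second-order finite difference along the wavenumber axis;
    removes constant and linear baseline drift."""
    return np.diff(spectrum, n=2)
```

A second derivative annihilates any baseline of the form a + b*x, which is why it is effective against the slowly varying drift typical of ATR-FTIR measurements.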


2D Integrated Bayesian Tomography of Plasma Electron Density Profile for HL-3 Based on Gaussian Process

Wang, Cong, Yang, Renjie, Li, Dong, Yang, Zongyu, Wang, Zhijun, Wei, Yixiong, Li, Jing

arXiv.org Artificial Intelligence

This paper introduces an integrated Bayesian model that combines line-integral measurements and point values using a Gaussian Process (GP). The proposed method leverages Gaussian Process Regression (GPR) to incorporate point values into 2D profiles and employs coordinate mapping to integrate magnetic flux information for 2D inversion. The average relative error of the reconstructed profile, using the integrated Bayesian tomography model with normalized magnetic flux, is as low as 3.60*10^(-4). Additionally, sensitivity tests were conducted on the number of grids, the standard deviation of synthetic diagnostic data, and noise levels, laying a solid foundation for applying the model to experimental data. This work not only achieves accurate 2D inversion using the integrated Bayesian model but also provides a robust framework for decoupling pressure information from equilibrium reconstruction, making it possible to optimize equilibrium reconstruction using the inversion results.
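The GPR building block the method relies on can be illustrated with a minimal 1D sketch, assuming an RBF kernel and fixed hyperparameters; the paper's 2D formulation with coordinate mapping and line integrals is considerably more involved:

```python
import numpy as np

def rbf(a, b, length=1.0, var=1.0):
    """Squared-exponential (RBF) kernel between two 1D point sets."""
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / length) ** 2)

def gpr_predict(x_train, y_train, x_test, noise=1e-4):
    """GP posterior mean at x_test given noisy observations y_train."""
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf(x_test, x_train)
    alpha = np.linalg.solve(K, y_train)  # (K + noise*I)^-1 y
    return Ks @ alpha
```

Line-integral data enter the same framework naturally because integration is linear: the covariance between a chord measurement and a point value is the line integral of the kernel, which is what lets the integrated model fuse both data types in one posterior.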


ONION: Physics-Informed Deep Learning Model for Line Integral Diagnostics Across Fusion Devices

Wang, Cong, Yang, Weizhe, Wang, Haiping, Yang, Renjie, Li, Jing, Wang, Zhijun, Yu, Xinyao, Wei, Yixiong, Huang, Xianli, Liu, Zhaoyang, Zou, Changqing, Zhao, Zhifeng

arXiv.org Artificial Intelligence

This paper introduces a Physics-Informed model architecture that can be adapted to various backbone networks. The model incorporates physical information as additional input and is constrained by a Physics-Informed loss function. Experimental results demonstrate that the additional input of physical information substantially improves the model's performance. In addition, adopting the Softplus activation function in the final two fully connected layers significantly enhances model performance. The Physics-Informed loss function corrects the model's predictions, bringing the back-projections closer to the actual inputs and reducing the errors associated with inversion algorithms. In this work, we have developed a Phantom Data Model to generate customized line-integral diagnostic datasets and have also collected SXR diagnostic datasets from EAST and HL-2A. The code, models, and some datasets are publicly available at https://github.com/calledice/onion. Keywords: PINN; Deep learning; Tokamak; EAST; HL-2A; Soft x-rays
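A Physics-Informed loss of the kind described, combining a data term with a back-projection consistency term, can be sketched as follows. The projection-matrix formulation and the weight `w` are assumptions for illustration, not the paper's exact loss:

```python
import numpy as np

def physics_informed_loss(pred_profile, target_profile, geometry, measured, w=0.1):
    """Data term plus a physics term. `geometry` is an assumed
    (n_chords, n_pixels) projection matrix encoding the chord paths;
    the physics term penalizes mismatch between the forward
    line-integral projection of the prediction and the measured
    integrals (the 'back-projection' consistency)."""
    data = np.mean((pred_profile - target_profile) ** 2)
    physics = np.mean((geometry @ pred_profile - measured) ** 2)
    return data + w * physics
```

The physics term requires no extra labels: it reuses the measured line integrals, so it keeps constraining the network even on samples whose true 2D profile is unknown.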


SynFundus: A synthetic fundus images dataset with millions of samples and multi-disease annotations

Shang, Fangxin, Fu, Jie, Yang, Yehui, Huang, Haifeng, Liu, Junwei, Ma, Lei

arXiv.org Artificial Intelligence

In the field of medical imaging, large-scale public datasets with high-quality annotations are rare due to data privacy and annotation cost. To address this issue, we release SynFundus-1M, a high-quality synthetic dataset containing over 1 million fundus images covering 11 disease types. Moreover, we intentionally diversify the readability of the images and accordingly provide four quality scores for each image. To the best of our knowledge, SynFundus-1M is currently the largest fundus dataset with the most sophisticated annotations. All the images are generated by a Denoising Diffusion Probabilistic Model, named SynFundus-Generator. Trained on over 1.3 million private fundus images, our SynFundus-Generator achieves significantly superior performance in generating fundus images compared to recent related works. Furthermore, we blend synthetic images from SynFundus-1M with real fundus images, and ophthalmologists can hardly distinguish the synthetic images from the real ones. Through extensive experiments, we demonstrate that both convolutional neural networks (CNNs) and Vision Transformers (ViTs) can benefit from SynFundus-1M, whether by pretraining or by training directly. Compared to datasets like ImageNet or EyePACS, models trained on SynFundus-1M not only achieve better performance but also converge faster on various downstream tasks.


Designing a Deep Learning-Driven Resource-Efficient Diagnostic System for Metastatic Breast Cancer: Reducing Long Delays of Clinical Diagnosis and Improving Patient Survival in Developing Countries

Gao, William, Wang, Dayong, Huang, Yi

arXiv.org Artificial Intelligence

Breast cancer is one of the leading causes of cancer mortality. Breast cancer patients in developing countries, especially in sub-Saharan Africa, South Asia, and South America, suffer from the highest mortality rates in the world. One crucial factor contributing to this global disparity is the long delay of diagnosis caused by a severe shortage of trained pathologists, which has led to a large proportion of late-stage presentations at diagnosis. The delay between the initial development of symptoms and the receipt of a diagnosis can stretch upwards of 15 months. To tackle this critical healthcare disparity, this research developed a deep learning-based diagnostic system for metastatic breast cancer that achieves both high diagnostic accuracy and computational efficiency. In our evaluation, the MobileNetV2-based diagnostic model outperformed the more complex VGG16, ResNet50, and ResNet101 models in diagnostic accuracy, model generalization, and training efficiency. Visual comparisons between model predictions and ground truth demonstrate that the MobileNetV2 diagnostic models can identify very small cancerous nodes embedded in a large area of normal cells, which is challenging for manual image analysis. Equally important, the lightweight MobileNetV2 models are computationally efficient and ready for mobile devices or devices of low computational power. These advances empower the development of a resource-efficient, high-performing AI-based metastatic breast cancer diagnostic system that can adapt to under-resourced healthcare facilities in developing countries. This research provides an innovative technological solution to address the long delays in metastatic breast cancer diagnosis and the consequent disparity in patient survival outcomes in developing countries.


On the Impact of Voice Anonymization on Speech-Based COVID-19 Detection

Zhu, Yi, Imoussaïne-Aïkous, Mohamed, Côté-Lussier, Carolyn, Falk, Tiago H.

arXiv.org Artificial Intelligence

With advances seen in deep learning, voice-based applications are burgeoning, ranging from personal assistants, affective computing, to remote disease diagnostics. As the voice contains both linguistic and paralinguistic information (e.g., vocal pitch, intonation, speech rate, loudness), there is growing interest in voice anonymization to preserve speaker privacy and identity. Voice privacy challenges have emerged over the last few years and focus has been placed on removing speaker identity while keeping linguistic content intact. For affective computing and disease monitoring applications, however, the paralinguistic content may be more critical. Unfortunately, the effects that anonymization may have on these systems are still largely unknown. In this paper, we fill this gap and focus on one particular health monitoring application: speech-based COVID-19 diagnosis. We test two popular anonymization methods and their impact on five different state-of-the-art COVID-19 diagnostic systems using three public datasets. We validate the effectiveness of the anonymization methods, compare their computational complexity, and quantify the impact across different testing scenarios for both within- and across-dataset conditions. Lastly, we show the benefits of anonymization as a data augmentation tool to help recover some of the COVID-19 diagnostic accuracy loss seen with anonymized data.


Using artificial intelligence to diagnose cancer

#artificialintelligence

During her Ph.D., Dr. Qurrat Ul Ain developed a computer-aided diagnostic system that can identify certain characteristics of the disease from a photograph of a skin lesion. "Skin cancer has certain unique visual features that help to differentiate it from normal skin," Dr. Qurrat Ul Ain says. "These include color, texture, and the shape of lesions. By showing our artificial intelligence program images of cancerous skin, we were able to teach it to identify cancer when shown other photographs." Dr. Qurrat Ul Ain's diagnostic system achieved a 100% accuracy rating in identifying images of melanoma based on the more than 600 images tested so far.